Dataset statistics
| Number of variables | 19 |
|---|---|
| Number of observations | 17290 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 4 |
| Duplicate rows (%) | < 0.1% |
| Total size in memory | 2.5 MiB |
| Average record size in memory | 152.0 B |
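The memory figures above are consistent with every column being held as one 8-byte value per row. The check below is only a sketch under that assumption: the 8-byte width and the neglect of index overhead are assumptions, not something the report states.

```python
# Assumption: each of the 19 columns stores one 8-byte value per row
# (for example int64 or float64); index overhead is ignored.
N_ROWS = 17290          # number of observations, from the table above
N_COLS = 19             # number of variables, from the table above
BYTES_PER_VALUE = 8     # assumed, not stated in the report

bytes_per_column = N_ROWS * BYTES_PER_VALUE   # bytes used by one column
bytes_per_record = N_COLS * BYTES_PER_VALUE   # bytes used by one row
total_bytes = N_ROWS * bytes_per_record       # bytes used by the whole table

kib_per_column = bytes_per_column / 1024
total_mib = total_bytes / 1024 ** 2

print(f"{kib_per_column:.1f} KiB per column")
print(f"{bytes_per_record:.1f} B per record")
print(f"{total_mib:.1f} MiB in total")
```

Under this assumption the per-column figure also matches the "Memory size" of 135.1 KiB reported for each variable.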
Variable types
| NUM | 18 |
|---|---|
| BOOL | 1 |
Warnings
| Dataset has 4 (< 0.1%) duplicate rows | Duplicates |
|---|---|
| view has 15594 (90.2%) zeros | Zeros |
| sqft_basement has 10524 (60.9%) zeros | Zeros |
| yr_renovated has 16573 (95.9%) zeros | Zeros |
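The zero percentages above follow directly from the zero counts and the number of observations. A minimal sketch of that arithmetic, where `zero_share` is a hypothetical helper name and not part of any library:

```python
N_ROWS = 17290  # number of observations reported in the dataset statistics

# Zero counts copied from the lines above.
zero_counts = {"view": 15594, "sqft_basement": 10524, "yr_renovated": 16573}


def zero_share(count: int, n_rows: int = N_ROWS) -> float:
    """Return the percentage of rows whose value is zero."""
    return 100.0 * count / n_rows


shares = {name: round(zero_share(count), 1) for name, count in zero_counts.items()}
print(shares)
```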
Reproduction
| Analysis started | 2020-11-23 16:27:45.564969 |
|---|---|
| Analysis finished | 2020-11-23 16:28:37.499511 |
| Duration | 51.93 seconds |
| Software version | pandas-profiling v2.9.0 |
| Download configuration | config.yaml |
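The reported duration can be re-derived from the two timestamps. This is a small sketch using only the Python standard library. The format string is an assumption about how the timestamps are written.

```python
from datetime import datetime

# Assumed timestamp layout: date, time, and microseconds.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2020-11-23 16:27:45.564969", FMT)
finished = datetime.strptime("2020-11-23 16:28:37.499511", FMT)

# Elapsed wall-clock time in seconds.
duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```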
price
Real number (ℝ≥0)
| Distinct | 3536 |
|---|---|
| Distinct (%) | 20.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 539524.7534 |
|---|---|
| Minimum | 75000 |
| Maximum | 7700000 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 135.1 KiB |
Quantile statistics
| Minimum | 75000 |
|---|---|
| 5-th percentile | 210000 |
| Q1 | 322000 |
| median | 450000 |
| Q3 | 645000 |
| 95-th percentile | 1150000 |
| Maximum | 7700000 |
| Range | 7625000 |
| Interquartile range (IQR) | 323000 |
Descriptive statistics
| Standard deviation | 367129.5559 |
|---|---|
| Coefficient of variation (CV) | 0.6804684189 |
| Kurtosis | 34.31837739 |
| Mean | 539524.7534 |
| Median Absolute Deviation (MAD) | 150000 |
| Skewness | 4.046741542 |
| Sum | 9328382986 |
| Variance | 1.347841108e+11 |
| Monotonicity | Not monotonic |
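Several of the descriptive statistics for price are simple functions of one another. The sketch below re-derives them from the rounded values reported above, so small rounding differences are expected. It illustrates the definitions and is not the profiler's own code.

```python
import math

# Values copied from the price tables above (rounded as reported).
n = 17290
mean, std = 539524.7534, 367129.5559
minimum, maximum = 75000, 7700000
q1, q3 = 322000, 645000

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range (IQR)
cv = std / mean                   # Coefficient of variation (CV)
variance = std ** 2               # Variance is the squared standard deviation
total = mean * n                  # Sum is the mean times the number of rows

print(value_range, iqr)
print(f"CV = {cv:.6f}, variance = {variance:.6e}, sum = {total:.0f}")

# Compare with the reported figures, allowing for rounding of the inputs.
print(math.isclose(cv, 0.6804684189, rel_tol=1e-6))
```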
| Value | Count | Frequency (%) | |
| 450000 | 149 | 0.9% | |
| 350000 | 137 | 0.8% | |
| 500000 | 123 | 0.7% | |
| 425000 | 120 | 0.7% | |
| 550000 | 120 | 0.7% | |
| 325000 | 115 | 0.7% | |
| 400000 | 113 | 0.7% | |
| 375000 | 110 | 0.6% | |
| 250000 | 105 | 0.6% | |
| 300000 | 103 | 0.6% | |
| Other values (3526) | 16095 | 93.1% |
| Value | Count | Frequency (%) | |
| 75000 | 1 | < 0.1% | |
| 78000 | 1 | < 0.1% | |
| 80000 | 1 | < 0.1% | |
| 81000 | 1 | < 0.1% | |
| 82000 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 7700000 | 1 | < 0.1% | |
| 6885000 | 1 | < 0.1% | |
| 5570000 | 1 | < 0.1% | |
| 5350000 | 1 | < 0.1% | |
| 5300000 | 1 | < 0.1% |
bedrooms
Real number (ℝ≥0)
| Distinct | 13 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.364777328 |
|---|---|
| Minimum | 0 |
| Maximum | 33 |
| Zeros | 12 |
| Zeros (%) | 0.1% |
| Memory size | 135.1 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 3 |
| median | 3 |
| Q3 | 4 |
| 95-th percentile | 5 |
| Maximum | 33 |
| Range | 33 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 0.9345629525 |
|---|---|
| Coefficient of variation (CV) | 0.277748826 |
| Kurtosis | 59.69661587 |
| Mean | 3.364777328 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 2.291539103 |
| Sum | 58177 |
| Variance | 0.8734079121 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 3 | 7888 | 45.6% | |
| 4 | 5471 | 31.6% | |
| 2 | 2222 | 12.9% | |
| 5 | 1271 | 7.4% | |
| 6 | 210 | 1.2% | |
| 1 | 166 | 1.0% | |
| 7 | 30 | 0.2% | |
| 8 | 13 | 0.1% | |
| 0 | 12 | 0.1% | |
| 9 | 4 | < 0.1% | |
| Other values (3) | 3 | < 0.1% |
| Value | Count | Frequency (%) | |
| 0 | 12 | 0.1% | |
| 1 | 166 | 1.0% | |
| 2 | 2222 | 12.9% | |
| 3 | 7888 | 45.6% | |
| 4 | 5471 | 31.6% |
| Value | Count | Frequency (%) | |
| 33 | 1 | < 0.1% | |
| 11 | 1 | < 0.1% | |
| 10 | 1 | < 0.1% | |
| 9 | 4 | < 0.1% | |
| 8 | 13 | 0.1% |
bathrooms
Real number (ℝ≥0)
| Distinct | 29 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.110598612 |
|---|---|
| Minimum | 0 |
| Maximum | 8 |
| Zeros | 9 |
| Zeros (%) | 0.1% |
| Memory size | 135.1 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1.5 |
| median | 2.25 |
| Q3 | 2.5 |
| 95-th percentile | 3.5 |
| Maximum | 8 |
| Range | 8 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 0.7691985407 |
|---|---|
| Coefficient of variation (CV) | 0.3644456774 |
| Kurtosis | 1.275629087 |
| Mean | 2.110598612 |
| Median Absolute Deviation (MAD) | 0.5 |
| Skewness | 0.501743885 |
| Sum | 36492.25 |
| Variance | 0.5916663951 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 2.5 | 4355 | 25.2% | |
| 1 | 3111 | 18.0% | |
| 1.75 | 2444 | 14.1% | |
| 2.25 | 1615 | 9.3% | |
| 2 | 1528 | 8.8% | |
| 1.5 | 1153 | 6.7% | |
| 2.75 | 915 | 5.3% | |
| 3 | 588 | 3.4% | |
| 3.5 | 583 | 3.4% | |
| 3.25 | 487 | 2.8% | |
| Other values (19) | 511 | 3.0% |
| Value | Count | Frequency (%) | |
| 0 | 9 | 0.1% | |
| 0.5 | 3 | < 0.1% | |
| 0.75 | 61 | 0.4% | |
| 1 | 3111 | 18.0% | |
| 1.25 | 8 | < 0.1% |
| Value | Count | Frequency (%) | |
| 8 | 2 | < 0.1% | |
| 7.75 | 1 | < 0.1% | |
| 7.5 | 1 | < 0.1% | |
| 6.5 | 1 | < 0.1% | |
| 6.25 | 1 | < 0.1% |
sqft_living
Real number (ℝ≥0)
| Distinct | 927 |
|---|---|
| Distinct (%) | 5.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2074.043378 |
|---|---|
| Minimum | 290 |
| Maximum | 13540 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 135.1 KiB |
Quantile statistics
| Minimum | 290 |
|---|---|
| 5-th percentile | 930 |
| Q1 | 1420 |
| median | 1910 |
| Q3 | 2540 |
| 95-th percentile | 3760 |
| Maximum | 13540 |
| Range | 13250 |
| Interquartile range (IQR) | 1120 |
Descriptive statistics
| Standard deviation | 912.9222551 |
|---|---|
| Coefficient of variation (CV) | 0.4401654589 |
| Kurtosis | 5.326136058 |
| Mean | 2074.043378 |
| Median Absolute Deviation (MAD) | 550 |
| Skewness | 1.456396663 |
| Sum | 35860210 |
| Variance | 833427.0439 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 1400 | 118 | 0.7% | |
| 1660 | 110 | 0.6% | |
| 1300 | 109 | 0.6% | |
| 1800 | 106 | 0.6% | |
| 1480 | 106 | 0.6% | |
| 1010 | 104 | 0.6% | |
| 1820 | 101 | 0.6% | |
| 1720 | 101 | 0.6% | |
| 1440 | 100 | 0.6% | |
| 1250 | 100 | 0.6% | |
| Other values (917) | 16235 | 93.9% |
| Value | Count | Frequency (%) | |
| 290 | 1 | < 0.1% | |
| 370 | 1 | < 0.1% | |
| 384 | 1 | < 0.1% | |
| 390 | 2 | < 0.1% | |
| 410 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 13540 | 1 | < 0.1% | |
| 12050 | 1 | < 0.1% | |
| 9890 | 1 | < 0.1% | |
| 9200 | 1 | < 0.1% | |
| 8670 | 1 | < 0.1% |
sqft_lot
Real number (ℝ≥0)
| Distinct | 8452 |
|---|---|
| Distinct (%) | 48.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 15195.02707 |
|---|---|
| Minimum | 520 |
| Maximum | 1651359 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 135.1 KiB |
Quantile statistics
| Minimum | 520 |
|---|---|
| 5-th percentile | 1793 |
| Q1 | 5030.75 |
| median | 7590 |
| Q3 | 10676.5 |
| 95-th percentile | 43327.95 |
| Maximum | 1651359 |
| Range | 1650839 |
| Interquartile range (IQR) | 5645.75 |
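Fractional quartiles such as 5030.75 arise when a quantile falls between two observations and is interpolated. The sketch below shows linear interpolation between the two nearest ranks, which is the default in pandas. That the report was produced this way is an assumption, and `quantile` is a hypothetical helper, not a library function.

```python
def quantile(values, q):
    """Return the q-quantile (0 <= q <= 1) using linear interpolation."""
    data = sorted(values)
    if not data:
        raise ValueError("quantile() needs at least one value")
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must lie in [0, 1]")
    pos = q * (len(data) - 1)              # fractional index into the sorted data
    lower = int(pos)                       # rank just below the position
    upper = min(lower + 1, len(data) - 1)  # rank just above, clamped to the end
    frac = pos - lower
    return data[lower] + frac * (data[upper] - data[lower])


# Hypothetical toy data, not taken from the dataset.
sample = [1, 2, 3, 4]
q1, q3 = quantile(sample, 0.25), quantile(sample, 0.75)
print(q1, q3, q3 - q1)
```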
Descriptive statistics
| Standard deviation | 42821.05179 |
|---|---|
| Coefficient of variation (CV) | 2.818096447 |
| Kurtosis | 299.8414412 |
| Mean | 15195.02707 |
| Median Absolute Deviation (MAD) | 2610 |
| Skewness | 13.54738572 |
| Sum | 262722018 |
| Variance | 1833642476 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 5000 | 292 | 1.7% | |
| 6000 | 240 | 1.4% | |
| 4000 | 207 | 1.2% | |
| 7200 | 174 | 1.0% | |
| 4500 | 94 | 0.5% | |
| 7500 | 90 | 0.5% | |
| 4800 | 87 | 0.5% | |
| 9600 | 86 | 0.5% | |
| 8400 | 85 | 0.5% | |
| 3600 | 83 | 0.5% | |
| Other values (8442) | 15852 | 91.7% |
| Value | Count | Frequency (%) | |
| 520 | 1 | < 0.1% | |
| 572 | 1 | < 0.1% | |
| 609 | 1 | < 0.1% | |
| 638 | 1 | < 0.1% | |
| 649 | 2 | < 0.1% |
| Value | Count | Frequency (%) | |
| 1651359 | 1 | < 0.1% | |
| 1164794 | 1 | < 0.1% | |
| 1074218 | 1 | < 0.1% | |
| 1024068 | 1 | < 0.1% | |
| 982998 | 1 | < 0.1% |
floors
Real number (ℝ≥0)
| Distinct | 6 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.49534413 |
|---|---|
| Minimum | 1 |
| Maximum | 3.5 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 135.1 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 1.5 |
| Q3 | 2 |
| 95-th percentile | 2 |
| Maximum | 3.5 |
| Range | 2.5 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 0.5406370881 |
|---|---|
| Coefficient of variation (CV) | 0.3615469359 |
| Kurtosis | -0.4813253064 |
| Mean | 1.49534413 |
| Median Absolute Deviation (MAD) | 0.5 |
| Skewness | 0.616994643 |
| Sum | 25854.5 |
| Variance | 0.292288461 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 1 | 8529 | 49.3% | |
| 2 | 6589 | 38.1% | |
| 1.5 | 1537 | 8.9% | |
| 3 | 497 | 2.9% | |
| 2.5 | 132 | 0.8% | |
| 3.5 | 6 | < 0.1% |
| Value | Count | Frequency (%) | |
| 1 | 8529 | 49.3% | |
| 1.5 | 1537 | 8.9% | |
| 2 | 6589 | 38.1% | |
| 2.5 | 132 | 0.8% | |
| 3 | 497 | 2.9% |
| Value | Count | Frequency (%) | |
| 3.5 | 6 | < 0.1% | |
| 3 | 497 | 2.9% | |
| 2.5 | 132 | 0.8% | |
| 2 | 6589 | 38.1% | |
| 1.5 | 1537 | 8.9% |
waterfront
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 135.1 KiB |
| Value | Count | Frequency (%) | |
| 0 | 17152 | 99.2% | |
| 1 | 138 | 0.8% |
view
Real number (ℝ≥0)
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.2346443031 |
|---|---|
| Minimum | 0 |
| Maximum | 4 |
| Zeros | 15594 |
| Zeros (%) | 90.2% |
| Memory size | 135.1 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 2 |
| Maximum | 4 |
| Range | 4 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.7680595297 |
|---|---|
| Coefficient of variation (CV) | 3.273292893 |
| Kurtosis | 10.92832371 |
| Mean | 0.2346443031 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 3.399929342 |
| Sum | 4057 |
| Variance | 0.5899154412 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 0 | 15594 | 90.2% | |
| 2 | 778 | 4.5% | |
| 3 | 397 | 2.3% | |
| 4 | 263 | 1.5% | |
| 1 | 258 | 1.5% |
| Value | Count | Frequency (%) | |
| 0 | 15594 | 90.2% | |
| 1 | 258 | 1.5% | |
| 2 | 778 | 4.5% | |
| 3 | 397 | 2.3% | |
| 4 | 263 | 1.5% |
| Value | Count | Frequency (%) | |
| 4 | 263 | 1.5% | |
| 3 | 397 | 2.3% | |
| 2 | 778 | 4.5% | |
| 1 | 258 | 1.5% | |
| 0 | 15594 | 90.2% |
condition
Real number (ℝ≥0)
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.411220359 |
|---|---|
| Minimum | 1 |
| Maximum | 5 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 135.1 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 3 |
| Q1 | 3 |
| median | 3 |
| Q3 | 4 |
| 95-th percentile | 5 |
| Maximum | 5 |
| Range | 4 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 0.6521153241 |
|---|---|
| Coefficient of variation (CV) | 0.191167751 |
| Kurtosis | 0.5014409231 |
| Mean | 3.411220359 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 1.039964803 |
| Sum | 58980 |
| Variance | 0.4252543959 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 3 | 11220 | 64.9% | |
| 4 | 4531 | 26.2% | |
| 5 | 1380 | 8.0% | |
| 2 | 137 | 0.8% | |
| 1 | 22 | 0.1% |
| Value | Count | Frequency (%) | |
| 1 | 22 | 0.1% | |
| 2 | 137 | 0.8% | |
| 3 | 11220 | 64.9% | |
| 4 | 4531 | 26.2% | |
| 5 | 1380 | 8.0% |
| Value | Count | Frequency (%) | |
| 5 | 1380 | 8.0% | |
| 4 | 4531 | 26.2% | |
| 3 | 11220 | 64.9% | |
| 2 | 137 | 0.8% | |
| 1 | 22 | 0.1% |
grade
Real number (ℝ≥0)
| Distinct | 12 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 7.654829381 |
|---|---|
| Minimum | 1 |
| Maximum | 13 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 135.1 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 6 |
| Q1 | 7 |
| median | 7 |
| Q3 | 8 |
| 95-th percentile | 10 |
| Maximum | 13 |
| Range | 12 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 1.180167091 |
|---|---|
| Coefficient of variation (CV) | 0.15417288 |
| Kurtosis | 1.226700173 |
| Mean | 7.654829381 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 0.7723409084 |
| Sum | 132352 |
| Variance | 1.392794363 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 7 | 7178 | 41.5% | |
| 8 | 4856 | 28.1% | |
| 9 | 2056 | 11.9% | |
| 6 | 1643 | 9.5% | |
| 10 | 929 | 5.4% | |
| 11 | 313 | 1.8% | |
| 5 | 201 | 1.2% | |
| 12 | 75 | 0.4% | |
| 4 | 23 | 0.1% | |
| 13 | 12 | 0.1% | |
| Other values (2) | 4 | < 0.1% |
| Value | Count | Frequency (%) | |
| 1 | 1 | < 0.1% | |
| 3 | 3 | < 0.1% | |
| 4 | 23 | 0.1% | |
| 5 | 201 | 1.2% | |
| 6 | 1643 | 9.5% |
| Value | Count | Frequency (%) | |
| 13 | 12 | 0.1% | |
| 12 | 75 | 0.4% | |
| 11 | 313 | 1.8% | |
| 10 | 929 | 5.4% | |
| 9 | 2056 | 11.9% |
sqft_above
Real number (ℝ≥0)
| Distinct | 844 |
|---|---|
| Distinct (%) | 4.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1784.347715 |
|---|---|
| Minimum | 290 |
| Maximum | 9410 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 135.1 KiB |
Quantile statistics
| Minimum | 290 |
|---|---|
| 5-th percentile | 850 |
| Q1 | 1190 |
| median | 1560 |
| Q3 | 2210 |
| 95-th percentile | 3375.5 |
| Maximum | 9410 |
| Range | 9120 |
| Interquartile range (IQR) | 1020 |
Descriptive statistics
| Standard deviation | 822.5892842 |
|---|---|
| Coefficient of variation (CV) | 0.4610027951 |
| Kurtosis | 3.262076156 |
| Mean | 1784.347715 |
| Median Absolute Deviation (MAD) | 460 |
| Skewness | 1.412496269 |
| Sum | 30851372 |
| Variance | 676653.1305 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 1010 | 170 | 1.0% | |
| 1300 | 167 | 1.0% | |
| 1200 | 163 | 0.9% | |
| 1220 | 153 | 0.9% | |
| 1060 | 147 | 0.9% | |
| 1400 | 147 | 0.9% | |
| 1340 | 145 | 0.8% | |
| 1140 | 143 | 0.8% | |
| 1250 | 141 | 0.8% | |
| 1320 | 138 | 0.8% | |
| Other values (834) | 15776 | 91.2% |
| Value | Count | Frequency (%) | |
| 290 | 1 | < 0.1% | |
| 370 | 1 | < 0.1% | |
| 384 | 1 | < 0.1% | |
| 390 | 2 | < 0.1% | |
| 410 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 9410 | 1 | < 0.1% | |
| 8860 | 1 | < 0.1% | |
| 8570 | 1 | < 0.1% | |
| 7880 | 1 | < 0.1% | |
| 7850 | 1 | < 0.1% |
sqft_basement
Real number (ℝ≥0)
| Distinct | 284 |
|---|---|
| Distinct (%) | 1.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 289.6956622 |
|---|---|
| Minimum | 0 |
| Maximum | 4130 |
| Zeros | 10524 |
| Zeros (%) | 60.9% |
| Memory size | 135.1 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 560 |
| 95-th percentile | 1190 |
| Maximum | 4130 |
| Range | 4130 |
| Interquartile range (IQR) | 560 |
Descriptive statistics
| Standard deviation | 440.1041734 |
|---|---|
| Coefficient of variation (CV) | 1.519194903 |
| Kurtosis | 2.328653533 |
| Mean | 289.6956622 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 1.549080927 |
| Sum | 5008838 |
| Variance | 193691.6834 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 0 | 10524 | 60.9% | |
| 600 | 182 | 1.1% | |
| 700 | 174 | 1.0% | |
| 500 | 172 | 1.0% | |
| 800 | 159 | 0.9% | |
| 400 | 147 | 0.9% | |
| 900 | 118 | 0.7% | |
| 300 | 112 | 0.6% | |
| 1000 | 111 | 0.6% | |
| 530 | 90 | 0.5% | |
| Other values (274) | 5501 | 31.8% |
| Value | Count | Frequency (%) | |
| 0 | 10524 | 60.9% | |
| 10 | 1 | < 0.1% | |
| 40 | 3 | < 0.1% | |
| 50 | 7 | < 0.1% | |
| 60 | 8 | < 0.1% |
| Value | Count | Frequency (%) | |
| 4130 | 1 | < 0.1% | |
| 3500 | 1 | < 0.1% | |
| 3480 | 1 | < 0.1% | |
| 3000 | 1 | < 0.1% | |
| 2810 | 1 | < 0.1% |
yr_built
Real number (ℝ≥0)
| Distinct | 116 |
|---|---|
| Distinct (%) | 0.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1970.821168 |
|---|---|
| Minimum | 1900 |
| Maximum | 2015 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 135.1 KiB |
Quantile statistics
| Minimum | 1900 |
|---|---|
| 5-th percentile | 1915 |
| Q1 | 1951 |
| median | 1975 |
| Q3 | 1997 |
| 95-th percentile | 2010 |
| Maximum | 2015 |
| Range | 115 |
| Interquartile range (IQR) | 46 |
Descriptive statistics
| Standard deviation | 29.51234269 |
|---|---|
| Coefficient of variation (CV) | 0.01497464263 |
| Kurtosis | -0.6807703274 |
| Mean | 1970.821168 |
| Median Absolute Deviation (MAD) | 23 |
| Skewness | -0.4653341188 |
| Sum | 34075498 |
| Variance | 870.9783708 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 2014 | 442 | 2.6% | |
| 2006 | 383 | 2.2% | |
| 2005 | 371 | 2.1% | |
| 2004 | 345 | 2.0% | |
| 2003 | 342 | 2.0% | |
| 1977 | 330 | 1.9% | |
| 2007 | 327 | 1.9% | |
| 1978 | 310 | 1.8% | |
| 1968 | 308 | 1.8% | |
| 1967 | 283 | 1.6% | |
| Other values (106) | 13849 | 80.1% |
| Value | Count | Frequency (%) | |
| 1900 | 70 | 0.4% | |
| 1901 | 24 | 0.1% | |
| 1902 | 20 | 0.1% | |
| 1903 | 38 | 0.2% | |
| 1904 | 40 | 0.2% |
| Value | Count | Frequency (%) | |
| 2015 | 32 | 0.2% | |
| 2014 | 442 | 2.6% | |
| 2013 | 150 | 0.9% | |
| 2012 | 136 | 0.8% | |
| 2011 | 100 | 0.6% |
yr_renovated
Real number (ℝ≥0)
| Distinct | 66 |
|---|---|
| Distinct (%) | 0.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 82.77362637 |
|---|---|
| Minimum | 0 |
| Maximum | 2015 |
| Zeros | 16573 |
| Zeros (%) | 95.9% |
| Memory size | 135.1 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 0 |
| Maximum | 2015 |
| Range | 2015 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 397.9776168 |
|---|---|
| Coefficient of variation (CV) | 4.808024418 |
| Kurtosis | 19.16917439 |
| Mean | 82.77362637 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 4.600593497 |
| Sum | 1431156 |
| Variance | 158386.1835 |
| Monotonicity | Not monotonic |
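With 95.9% zeros, the mean of about 82.77 mixes real years with the zero entries. If a zero stands for "never renovated" (an assumption, since the report does not define the encoding), the mean over the non-zero rows can be recovered from the reported sum and zero count:

```python
# Figures copied from the tables above.
n_rows = 17290
n_zeros = 16573     # rows where the value is 0
total = 1431156     # reported Sum of the column

# Zeros add nothing to the sum, so the sum belongs entirely to non-zero rows.
n_nonzero = n_rows - n_zeros
mean_all = total / n_rows          # mean over every row, as reported
mean_nonzero = total / n_nonzero   # mean over non-zero rows only

print(n_nonzero)
print(f"{mean_all:.2f} vs {mean_nonzero:.2f}")
```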
| Value | Count | Frequency (%) | |
| 0 | 16573 | 95.9% | |
| 2014 | 72 | 0.4% | |
| 2000 | 31 | 0.2% | |
| 2003 | 28 | 0.2% | |
| 2013 | 27 | 0.2% | |
| 2007 | 26 | 0.2% | |
| 2005 | 24 | 0.1% | |
| 2006 | 21 | 0.1% | |
| 2004 | 20 | 0.1% | |
| 1990 | 19 | 0.1% | |
| Other values (56) | 449 | 2.6% |
| Value | Count | Frequency (%) | |
| 0 | 16573 | 95.9% | |
| 1940 | 2 | < 0.1% | |
| 1944 | 1 | < 0.1% | |
| 1945 | 3 | < 0.1% | |
| 1946 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 2015 | 15 | 0.1% | |
| 2014 | 72 | 0.4% | |
| 2013 | 27 | 0.2% | |
| 2012 | 9 | 0.1% | |
| 2011 | 7 | < 0.1% |
zipcode
Real number (ℝ≥0)
| Distinct | 70 |
|---|---|
| Distinct (%) | 0.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 98077.8251 |
|---|---|
| Minimum | 98001 |
| Maximum | 98199 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 135.1 KiB |
Quantile statistics
| Minimum | 98001 |
|---|---|
| 5-th percentile | 98004 |
| Q1 | 98033 |
| median | 98065 |
| Q3 | 98117 |
| 95-th percentile | 98177 |
| Maximum | 98199 |
| Range | 198 |
| Interquartile range (IQR) | 84 |
Descriptive statistics
| Standard deviation | 53.30329337 |
|---|---|
| Coefficient of variation (CV) | 0.0005434795614 |
| Kurtosis | -0.8525081386 |
| Mean | 98077.8251 |
| Median Absolute Deviation (MAD) | 42 |
| Skewness | 0.401199936 |
| Sum | 1695765596 |
| Variance | 2841.241084 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 98115 | 484 | 2.8% | |
| 98117 | 481 | 2.8% | |
| 98038 | 472 | 2.7% | |
| 98052 | 470 | 2.7% | |
| 98103 | 469 | 2.7% | |
| 98034 | 440 | 2.5% | |
| 98042 | 434 | 2.5% | |
| 98118 | 431 | 2.5% | |
| 98023 | 399 | 2.3% | |
| 98133 | 394 | 2.3% | |
| Other values (60) | 12816 | 74.1% |
| Value | Count | Frequency (%) | |
| 98001 | 277 | 1.6% | |
| 98002 | 161 | 0.9% | |
| 98003 | 236 | 1.4% | |
| 98004 | 258 | 1.5% | |
| 98005 | 126 | 0.7% |
| Value | Count | Frequency (%) | |
| 98199 | 249 | 1.4% | |
| 98198 | 215 | 1.2% | |
| 98188 | 108 | 0.6% | |
| 98178 | 206 | 1.2% | |
| 98177 | 202 | 1.2% |
lat
Real number (ℝ≥0)
| Distinct | 4832 |
|---|---|
| Distinct (%) | 27.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 47.56018304 |
|---|---|
| Minimum | 47.1647 |
| Maximum | 47.7776 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 135.1 KiB |
Quantile statistics
| Minimum | 47.1647 |
|---|---|
| 5-th percentile | 47.3109 |
| Q1 | 47.4713 |
| median | 47.572 |
| Q3 | 47.6783 |
| 95-th percentile | 47.7494 |
| Maximum | 47.7776 |
| Range | 0.6129 |
| Interquartile range (IQR) | 0.207 |
Descriptive statistics
| Standard deviation | 0.1385656812 |
|---|---|
| Coefficient of variation (CV) | 0.002913480822 |
| Kurtosis | -0.6773150498 |
| Mean | 47.56018304 |
| Median Absolute Deviation (MAD) | 0.1049 |
| Skewness | -0.4888071489 |
| Sum | 822315.5647 |
| Variance | 0.019200448 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 47.5322 | 16 | 0.1% | |
| 47.5533 | 14 | 0.1% | |
| 47.6886 | 13 | 0.1% | |
| 47.6846 | 13 | 0.1% | |
| 47.5491 | 13 | 0.1% | |
| 47.6904 | 13 | 0.1% | |
| 47.7145 | 12 | 0.1% | |
| 47.5427 | 12 | 0.1% | |
| 47.6914 | 12 | 0.1% | |
| 47.6754 | 12 | 0.1% | |
| Other values (4822) | 17160 | 99.2% |
| Value | Count | Frequency (%) | |
| 47.1647 | 1 | < 0.1% | |
| 47.1776 | 2 | < 0.1% | |
| 47.1795 | 1 | < 0.1% | |
| 47.1803 | 1 | < 0.1% | |
| 47.1808 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 47.7776 | 3 | < 0.1% | |
| 47.7775 | 2 | < 0.1% | |
| 47.7774 | 1 | < 0.1% | |
| 47.7772 | 3 | < 0.1% | |
| 47.7771 | 1 | < 0.1% |
long
Real number (ℝ)
| Distinct | 735 |
|---|---|
| Distinct (%) | 4.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -122.2136619 |
|---|---|
| Minimum | -122.514 |
| Maximum | -121.315 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 135.1 KiB |
Quantile statistics
| Minimum | -122.514 |
|---|---|
| 5-th percentile | -122.387 |
| Q1 | -122.328 |
| median | -122.231 |
| Q3 | -122.124 |
| 95-th percentile | -121.978 |
| Maximum | -121.315 |
| Range | 1.199 |
| Interquartile range (IQR) | 0.204 |
Descriptive statistics
| Standard deviation | 0.1413044141 |
|---|---|
| Coefficient of variation (CV) | -0.001156208004 |
| Kurtosis | 1.108999643 |
| Mean | -122.2136619 |
| Median Absolute Deviation (MAD) | 0.101 |
| Skewness | 0.8949016449 |
| Sum | -2113074.214 |
| Variance | 0.01996693744 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| -122.29 | 100 | 0.6% | |
| -122.3 | 90 | 0.5% | |
| -122.362 | 89 | 0.5% | |
| -122.291 | 88 | 0.5% | |
| -122.288 | 84 | 0.5% | |
| -122.372 | 81 | 0.5% | |
| -122.172 | 80 | 0.5% | |
| -122.363 | 80 | 0.5% | |
| -122.375 | 78 | 0.5% | |
| -122.299 | 78 | 0.5% | |
| Other values (725) | 16442 | 95.1% |
| Value | Count | Frequency (%) | |
| -122.514 | 1 | < 0.1% | |
| -122.512 | 1 | < 0.1% | |
| -122.511 | 2 | < 0.1% | |
| -122.509 | 2 | < 0.1% | |
| -122.507 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| -121.315 | 2 | < 0.1% | |
| -121.316 | 1 | < 0.1% | |
| -121.319 | 1 | < 0.1% | |
| -121.321 | 1 | < 0.1% | |
| -121.325 | 1 | < 0.1% |
sqft_living15
Real number (ℝ≥0)
| Distinct | 713 |
|---|---|
| Distinct (%) | 4.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1986.67941 |
|---|---|
| Minimum | 399 |
| Maximum | 6210 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 135.1 KiB |
Quantile statistics
| Minimum | 399 |
|---|---|
| 5-th percentile | 1140 |
| Q1 | 1490 |
| median | 1840 |
| Q3 | 2360 |
| 95-th percentile | 3300 |
| Maximum | 6210 |
| Range | 5811 |
| Interquartile range (IQR) | 870 |
Descriptive statistics
| Standard deviation | 683.2534721 |
|---|---|
| Coefficient of variation (CV) | 0.3439173269 |
| Kurtosis | 1.437062962 |
| Mean | 1986.67941 |
| Median Absolute Deviation (MAD) | 410 |
| Skewness | 1.079476076 |
| Sum | 34349687 |
| Variance | 466835.3071 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 1560 | 158 | 0.9% | |
| 1540 | 156 | 0.9% | |
| 1440 | 151 | 0.9% | |
| 1500 | 140 | 0.8% | |
| 1550 | 135 | 0.8% | |
| 1760 | 134 | 0.8% | |
| 1460 | 133 | 0.8% | |
| 1580 | 133 | 0.8% | |
| 1620 | 131 | 0.8% | |
| 1480 | 131 | 0.8% | |
| Other values (703) | 15888 | 91.9% |
| Value | Count | Frequency (%) | |
| 399 | 1 | < 0.1% | |
| 460 | 2 | < 0.1% | |
| 620 | 1 | < 0.1% | |
| 670 | 1 | < 0.1% | |
| 690 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 6210 | 1 | < 0.1% | |
| 5790 | 5 | < 0.1% | |
| 5380 | 1 | < 0.1% | |
| 5220 | 1 | < 0.1% | |
| 5170 | 1 | < 0.1% |
sqft_lot15
Real number (ℝ≥0)
| Distinct | 7562 |
|---|---|
| Distinct (%) | 43.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 12845.61174 |
|---|---|
| Minimum | 651 |
| Maximum | 871200 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 135.1 KiB |
Quantile statistics
| Minimum | 651 |
|---|---|
| 5-th percentile | 1966.35 |
| Q1 | 5100 |
| median | 7600 |
| Q3 | 10078.75 |
| 95-th percentile | 37287.05 |
| Maximum | 871200 |
| Range | 870549 |
| Interquartile range (IQR) | 4978.75 |
Descriptive statistics
| Standard deviation | 28028.1806 |
|---|---|
| Coefficient of variation (CV) | 2.181926495 |
| Kurtosis | 159.9661426 |
| Mean | 12845.61174 |
| Median Absolute Deviation (MAD) | 2500 |
| Skewness | 9.779303265 |
| Sum | 222100627 |
| Variance | 785578907.8 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 5000 | 340 | 2.0% | |
| 4000 | 287 | 1.7% | |
| 6000 | 239 | 1.4% | |
| 7200 | 164 | 0.9% | |
| 7500 | 110 | 0.6% | |
| 4800 | 109 | 0.6% | |
| 3600 | 91 | 0.5% | |
| 5100 | 89 | 0.5% | |
| 4500 | 88 | 0.5% | |
| 8000 | 86 | 0.5% | |
| Other values (7552) | 15687 | 90.7% |
| Value | Count | Frequency (%) | |
| 651 | 1 | < 0.1% | |
| 659 | 1 | < 0.1% | |
| 660 | 1 | < 0.1% | |
| 748 | 2 | < 0.1% | |
| 750 | 4 | < 0.1% |
| Value | Count | Frequency (%) | |
| 871200 | 1 | < 0.1% | |
| 858132 | 1 | < 0.1% | |
| 560617 | 1 | < 0.1% | |
| 438213 | 1 | < 0.1% | |
| 434728 | 1 | < 0.1% |
Pearson's r
Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for an exactly linear relationship the steepness of the line does not affect r; only the sign of the slope does. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
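The calculation described above can be sketched in plain Python. `pearson_r` is a hypothetical helper written for illustration, not the function the report itself calls.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # The 1/n factors of the covariance and the standard deviations cancel,
    # so plain sums of products and squares are enough.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical toy data, not taken from the dataset.
x = [1, 2, 3, 4]
y = [1, 3, 2, 5]
print(pearson_r(x, y))
# Shifting and rescaling x by a positive factor leaves r unchanged.
print(pearson_r([10 * a + 5 for a in x], y))
```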
Spearman's ρ
Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables and is therefore better than Pearson's r at catching nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
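As a sketch of the rank-based calculation described above, the code below ranks both variables (tied values share the mean of their ranks) and then applies the Pearson formula to the ranks. All function names are hypothetical helpers, and the tie handling is one common convention rather than something the report specifies.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Covariance divided by the product of standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


def spearman_rho(x, y):
    """Pearson correlation of the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical toy data: y grows monotonically but not linearly with x.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print(spearman_rho(x, y), pearson_r(x, y))
```

On this toy data ρ reaches 1 because the relationship is perfectly monotonic, while r stays below 1 because it is not linear.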
Kendall's τ
Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures the ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations; τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
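The pair-counting definition above translates almost directly into code. The sketch below implements exactly that formula (often called tau-a), in which tied pairs count as neither concordant nor discordant. Statistical libraries frequently use tie-corrected variants, so their output can differ on data with ties. `kendall_tau` is a hypothetical helper name.

```python
from itertools import combinations


def kendall_tau(x, y):
    """(concordant - discordant) / total number of pairs, as described above."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1      # both variables move in the same direction
        elif sign < 0:
            discordant += 1      # the variables move in opposite directions
        # sign == 0 means a tie in x or y; it counts for neither group
    n_pairs = n * (n - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical toy data: two concordant pairs and one discordant pair.
tau = kendall_tau([1, 2, 3], [1, 3, 2])
print(tau)
```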
Phik (φk)
Phik (φk) is a practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available for Phik.
First rows
| | price | bedrooms | bathrooms | sqft_living | sqft_lot | floors | waterfront | view | condition | grade | sqft_above | sqft_basement | yr_built | yr_renovated | zipcode | lat | long | sqft_living15 | sqft_lot15 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 325000.0 | 3 | 1.75 | 1780 | 11096 | 1.0 | 0 | 0 | 3 | 7 | 1210 | 570 | 1979 | 0 | 98074 | 47.6170 | -122.051 | 1780 | 10640 |
| 1 | 278000.0 | 2 | 2.50 | 1420 | 2229 | 2.0 | 0 | 0 | 3 | 7 | 1420 | 0 | 2004 | 0 | 98059 | 47.4871 | -122.165 | 1500 | 2230 |
| 2 | 710000.0 | 2 | 1.00 | 1790 | 4000 | 1.0 | 0 | 0 | 4 | 7 | 1040 | 750 | 1923 | 0 | 98112 | 47.6405 | -122.301 | 1310 | 4000 |
| 3 | 389900.0 | 4 | 1.00 | 1710 | 117176 | 1.5 | 0 | 0 | 4 | 6 | 1710 | 0 | 1942 | 0 | 98055 | 47.4497 | -122.212 | 1940 | 12223 |
| 4 | 489000.0 | 4 | 1.00 | 1150 | 5217 | 1.5 | 0 | 0 | 3 | 7 | 1150 | 0 | 1951 | 0 | 98115 | 47.6806 | -122.287 | 1220 | 5217 |
| 5 | 440000.0 | 3 | 1.00 | 1710 | 6556 | 1.5 | 0 | 0 | 4 | 7 | 1200 | 510 | 1926 | 0 | 98133 | 47.7185 | -122.354 | 1410 | 6563 |
| 6 | 250750.0 | 5 | 1.75 | 2140 | 12058 | 1.0 | 0 | 0 | 4 | 8 | 2140 | 0 | 1951 | 0 | 98002 | 47.3167 | -122.214 | 1640 | 10125 |
| 7 | 1540000.0 | 5 | 3.25 | 2920 | 6960 | 2.0 | 0 | 1 | 3 | 9 | 2120 | 800 | 1953 | 2008 | 98105 | 47.6712 | -122.272 | 2470 | 6735 |
| 8 | 309950.0 | 1 | 1.00 | 1120 | 11800 | 1.5 | 0 | 0 | 3 | 7 | 1120 | 0 | 1950 | 0 | 98168 | 47.5123 | -122.331 | 2330 | 9290 |
| 9 | 310000.0 | 3 | 2.75 | 2150 | 6576 | 1.0 | 0 | 0 | 4 | 7 | 1900 | 250 | 1926 | 0 | 98155 | 47.7539 | -122.308 | 2150 | 9071 |
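In the rows above, sqft_living appears to equal sqft_above plus sqft_basement. The sketch below checks that relation on these ten rows and on the column sums reported in the descriptive statistics. Matching sums are consistent with the relation holding throughout, but they do not prove it for every row.

```python
# (sqft_living, sqft_above, sqft_basement) copied from the first rows above.
rows = [
    (1780, 1210, 570),
    (1420, 1420, 0),
    (1790, 1040, 750),
    (1710, 1710, 0),
    (1150, 1150, 0),
    (1710, 1200, 510),
    (2140, 2140, 0),
    (2920, 2120, 800),
    (1120, 1120, 0),
    (2150, 1900, 250),
]
rows_consistent = all(
    living == above + basement for living, above, basement in rows
)

# Column sums reported under "Descriptive statistics" for the three variables.
sum_living, sum_above, sum_basement = 35860210, 30851372, 5008838
sums_consistent = sum_living == sum_above + sum_basement

print(rows_consistent, sums_consistent)
```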
Last rows
| | price | bedrooms | bathrooms | sqft_living | sqft_lot | floors | waterfront | view | condition | grade | sqft_above | sqft_basement | yr_built | yr_renovated | zipcode | lat | long | sqft_living15 | sqft_lot15 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 17280 | 607500.0 | 4 | 2.50 | 3000 | 8100 | 2.0 | 0 | 0 | 3 | 8 | 3000 | 0 | 1992 | 0 | 98125 | 47.7151 | -122.305 | 1550 | 8100 |
| 17281 | 250000.0 | 3 | 2.00 | 1590 | 8100 | 1.0 | 0 | 0 | 3 | 7 | 1060 | 530 | 1996 | 0 | 98038 | 47.3611 | -122.047 | 1590 | 8100 |
| 17282 | 577500.0 | 3 | 1.75 | 2140 | 13286 | 1.0 | 0 | 0 | 4 | 8 | 1220 | 920 | 1964 | 0 | 98006 | 47.5722 | -122.128 | 2250 | 13286 |
| 17283 | 712000.0 | 3 | 2.00 | 1700 | 5100 | 1.5 | 0 | 0 | 4 | 7 | 1500 | 200 | 1924 | 0 | 98117 | 47.6790 | -122.390 | 1700 | 5100 |
| 17284 | 500000.0 | 4 | 2.00 | 1680 | 3813 | 2.0 | 0 | 0 | 4 | 7 | 1680 | 0 | 1900 | 0 | 98144 | 47.5930 | -122.293 | 2540 | 3996 |
| 17285 | 870000.0 | 4 | 2.75 | 3410 | 23000 | 2.0 | 0 | 0 | 4 | 10 | 3410 | 0 | 1982 | 0 | 98006 | 47.5559 | -122.149 | 2490 | 15512 |
| 17286 | 720000.0 | 4 | 2.50 | 3450 | 39683 | 2.0 | 0 | 0 | 3 | 10 | 3450 | 0 | 2002 | 0 | 98010 | 47.3420 | -122.025 | 3350 | 39750 |
| 17287 | 506000.0 | 3 | 1.75 | 2180 | 7700 | 1.0 | 0 | 0 | 3 | 8 | 1480 | 700 | 1961 | 0 | 98177 | 47.7594 | -122.361 | 2180 | 7604 |
| 17288 | 667000.0 | 5 | 2.00 | 1900 | 5470 | 1.0 | 0 | 0 | 3 | 7 | 1180 | 720 | 1930 | 1965 | 98105 | 47.6666 | -122.303 | 1300 | 3250 |
| 17289 | 480000.0 | 3 | 2.50 | 1250 | 1103 | 3.0 | 0 | 2 | 3 | 8 | 1250 | 0 | 2005 | 0 | 98103 | 47.6619 | -122.352 | 1250 | 1188 |
Duplicate rows (most frequent)
| | price | bedrooms | bathrooms | sqft_living | sqft_lot | floors | waterfront | view | condition | grade | sqft_above | sqft_basement | yr_built | yr_renovated | zipcode | lat | long | sqft_living15 | sqft_lot15 | count |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 259950.0 | 2 | 2.00 | 1070 | 649 | 2.0 | 0 | 0 | 3 | 9 | 720 | 350 | 2008 | 0 | 98106 | 47.5213 | -122.357 | 1070 | 928 | 2 |
| 1 | 529500.0 | 3 | 2.25 | 1410 | 905 | 3.0 | 0 | 0 | 3 | 9 | 1410 | 0 | 2014 | 0 | 98116 | 47.5818 | -122.402 | 1510 | 1352 | 2 |
| 2 | 555000.0 | 3 | 2.50 | 1940 | 3211 | 2.0 | 0 | 0 | 3 | 8 | 1940 | 0 | 2009 | 0 | 98027 | 47.5644 | -122.093 | 1880 | 3078 | 2 |
| 3 | 585000.0 | 3 | 2.50 | 2290 | 5089 | 2.0 | 0 | 0 | 3 | 9 | 2290 | 0 | 2001 | 0 | 98006 | 47.5443 | -122.172 | 2290 | 7984 | 2 |